Distribution-Aware Compressed Full-Text Indexes
نویسندگان
چکیده
منابع مشابه
Improved Compressed Indexes for Full-Text Document Retrieval
We give new space/time tradeoffs for compressed indexes that answer document retrieval queries on general sequences. On a collection of D documents of total length n, current approaches require at least |CSA| + O(n lgD lg lgD ) or 2|CSA| + o(n) bits of space, where CSA is a full-text index. Using monotone minimum perfect hash functions, we give new algorithms for document listing with frequenci...
متن کاملCompressed Text Indexes with Fast Locate
Compressed text (self-)indexes have matured up to a point where they can replace a text by a data structure that requires less space and, in addition to giving access to arbitrary text passages, support indexed text searches. At this point those indexes are competitive with traditional text indexes (which are very large) for counting the number of occurrences of a pattern in the text. Yet, they...
متن کاملImproving Text Indexes Using Compressed Permutations
Any sorting algorithm in the comparison model defines an encoding scheme for permutations. As adaptive sorting algorithms perform o(n lg n) comparisons on restricted classes of permutations, each defines one or more compression schemes for permutations. In the case of the compression schemes inspired by Adaptive Merge Sort, a small amount of additional data allows to support in good time the ac...
متن کاملComputing Matching Statistics and Maximal Exact Matches on Compressed Full-Text Indexes
Exact string matching is a problem that computer programmers face on a regular basis, and full-text indexes like the suffix tree or the suffix array provide fast string search over large texts. In the last decade, research on compressed indexes has flourished because the main problem in large-scale applications is the space consumption of the index. Nowadays, the most successful compressed inde...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Algorithmica
سال: 2013
ISSN: 0178-4617,1432-0541
DOI: 10.1007/s00453-013-9782-3